feat: add token usage and cost display to model responses #7
Merged
Adds SHOW_COST_INFO and COST_CURRENCY valves. When SHOW_COST_INFO is enabled, a formatted footnote is appended to each response showing token counts (prompt, completion, total) and the cost prefixed with the selected currency symbol.

OpenRouter includes usage data in every API response: in the JSON body for non-streaming requests and in the final SSE chunk before [DONE] for streaming requests. Both paths now capture and surface this data when SHOW_COST_INFO is true.

Cost formatting uses adaptive precision: 6 decimals for micro-costs (<0.0001), 5 for small costs (<0.01), and 4 for normal amounts. Token counts use comma separators for readability.

COST_CURRENCY is a display-only label (USD/EUR/GBP/JPY/CAD/AUD); OpenRouter always bills in USD.

Adds 52 new assertions (374 total, all passing).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
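The adaptive-precision rule described above can be sketched roughly as follows. This is a minimal illustration built only from the thresholds stated in the commit message; the function names `format_cost` and `format_tokens` are hypothetical and are not the PR's actual `_format_cost_info()` helper.

```python
def format_cost(cost: float, symbol: str = "$") -> str:
    """Format a cost, using more decimals the smaller the amount is."""
    if cost < 0.0001:
        decimals = 6  # micro-costs
    elif cost < 0.01:
        decimals = 5  # small costs
    else:
        decimals = 4  # normal amounts
    return f"{symbol}{cost:.{decimals}f}"


def format_tokens(count: int) -> str:
    """Render a token count with comma thousands separators."""
    return f"{count:,}"


print(format_cost(0.000005))   # $0.000005
print(format_cost(0.00123))    # $0.00123
print(format_cost(0.0567))     # $0.0567
print(format_tokens(1234567))  # 1,234,567
```

Choosing the precision from the magnitude keeps very small per-turn costs from rounding down to zero while avoiding long trailing digits on larger amounts.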
Pull request overview
Adds an optional “token usage + cost” footnote to OpenRouter model responses (both streaming and non-streaming) so Open WebUI users can see per-turn usage/cost directly in-chat.
Changes:
- Introduces SHOW_COST_INFO and COST_CURRENCY valves and a _format_cost_info() formatter with a _CURRENCY_SYMBOLS map.
- Appends formatted usage/cost info to non-stream responses when enabled.
- Tracks the latest usage SSE chunk and appends the formatted footnote after streaming completes.
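The streaming behaviour in the last bullet might look roughly like the sketch below. It assumes OpenAI-style SSE chunks (`data: {...}` lines with `choices[].delta.content`), which is how OpenRouter-compatible streams are commonly shaped. The function name `stream_with_usage` and the footnote layout are hypothetical; only the idea of a `latest_usage` variable comes from the PR.

```python
import json
from typing import Iterable, Iterator, Optional


def stream_with_usage(sse_lines: Iterable[str], show_cost_info: bool) -> Iterator[str]:
    """Yield content deltas, then one usage footnote after the stream ends."""
    latest_usage: Optional[dict] = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and SSE comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # this sketch simply ignores malformed chunks
        if chunk.get("usage"):
            latest_usage = chunk["usage"]  # remember the most recent usage object
        for choice in chunk.get("choices") or []:
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
    if show_cost_info and latest_usage:
        # Hypothetical footnote layout, not the PR's actual format.
        yield f"\n\n---\nTokens: {latest_usage.get('total_tokens', 0):,}"


lines = [
    'data: {"choices": [{"delta": {"content": "Hi"}}]}',
    'data: {"choices": [], "usage": {"total_tokens": 1200}}',
    "data: [DONE]",
]
print("".join(stream_with_usage(lines, show_cost_info=True)))
```

Deferring the footnote until the loop exits means the reply text streams unchanged, and a stream that never sends a usage chunk simply ends without a footnote.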
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| openrouter_pipe.py | Adds valves + formatting helper; appends token/cost footnote in both non-stream and stream flows. |
| test_pipe.py | Adds unit/integration assertions covering formatting and valve behavior for both response modes. |
    if cost is not None:
        try:
            cost_f = float(cost)
            symbol = _CURRENCY_SYMBOLS.get(currency, f"{currency} ")

        description="Append token usage and cost to each response",
    )
    COST_CURRENCY: str = Field(
        default=os.getenv("OPENROUTER_COST_CURRENCY", "USD"),

    _assert(v.REQUEST_TIMEOUT == 90, "REQUEST_TIMEOUT 90")
    _assert(v.MAX_RETRIES == 2, "MAX_RETRIES 2")
    _assert(v.SHOW_COST_INFO is False, "SHOW_COST_INFO false by default")
    _assert(v.COST_CURRENCY == "USD", "COST_CURRENCY USD by default")
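The review fragments above show the cost being converted with `float(cost)` and the symbol being looked up with a fallback of the currency code plus a space. A self-contained sketch of that pattern could look like the following. The symbol table contents, the `render_cost` name, the fixed 4-decimal precision, and the choice to return `None` for an unparseable cost are all assumptions made for illustration; the PR page does not show what the real helper does in those cases.

```python
from typing import Any, Optional

# Assumed symbol table; the real _CURRENCY_SYMBOLS contents are not shown on this page.
CURRENCY_SYMBOLS = {"USD": "$", "EUR": "€", "GBP": "£"}


def render_cost(cost: Any, currency: str = "USD") -> Optional[str]:
    """Return a display string for the cost, or None when it cannot be parsed."""
    if cost is None:
        return None
    try:
        cost_f = float(cost)
    except (TypeError, ValueError):
        return None  # assumption: an invalid cost is skipped rather than raised
    # Unknown codes fall back to the code itself plus a space, e.g. "CHF ".
    symbol = CURRENCY_SYMBOLS.get(currency, f"{currency} ")
    return f"{symbol}{cost_f:.4f}"


print(render_cost(0.0567, "EUR"))    # €0.0567
print(render_cost("0.0567", "CHF"))  # CHF 0.0567
print(render_cost("n/a"))            # None
```

The `dict.get` default keeps an unrecognised currency code readable instead of raising a `KeyError`.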
Summary
- SHOW_COST_INFO valve (default false): when enabled, a footnote is appended to every model response showing token usage and cost
- COST_CURRENCY valve (select: USD/EUR/GBP/JPY/CAD/AUD, default USD): controls the currency symbol shown next to the cost
- _format_cost_info() helper and _CURRENCY_SYMBOLS dict at module level

What changed and why

OpenRouter already includes a usage object in every response (prompt_tokens, completion_tokens, total_tokens, cost). Until now the pipe discarded this data entirely. This PR surfaces it as an optional footnote after each reply, letting users see exactly what each conversation turn costs without leaving Open WebUI.

Non-streaming: usage is read from the final JSON response, and the formatted footnote is appended to final_parts when SHOW_COST_INFO is true.

Streaming: a latest_usage variable tracks the usage chunk as SSE lines are consumed (OpenRouter sends it in the last chunk before [DONE]); after the stream ends, the footnote is yielded.

Cost formatting uses adaptive precision:
- < 0.0001 → 6 decimal places ($0.000005)
- 0.0001–0.01 → 5 decimal places ($0.00123)
- >= 0.01 → 4 decimal places ($0.0567)

COST_CURRENCY is a display-only label: OpenRouter always bills in USD, and the amount is shown unconverted, so no exchange-rate API call is needed.

Example output with SHOW_COST_INFO=true, COST_CURRENCY=USD:

Test plan
- python test_pipe.py: 374 assertions, 0 failures (52 new assertions in section 33)
- _format_cost_info() tested for: empty dict, tokens-only, zero cost, micro/small/normal costs, EUR/GBP/unknown currency, invalid cost value, missing total_tokens, output format/separators, large numbers with commas
- Non-streaming: SHOW_COST_INFO=false (no footnote), SHOW_COST_INFO=true USD/EUR, no usage field in response (no crash)
- Streaming: SHOW_COST_INFO=false, no usage chunk at all (no crash)
- Valve defaults (SHOW_COST_INFO=false, COST_CURRENCY=USD) and select schema (6 options)

🤖 Generated with Claude Code
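As a rough illustration of the non-streaming path described under "What changed and why", the sketch below reads `usage` from a JSON response body and appends a footnote to a `final_parts` list only when the flag is on. It assumes an OpenAI-style body (`choices[0].message.content`). The `build_reply` name, the footnote layout, and the fallback for a missing `total_tokens` are hypothetical and not taken from the PR.

```python
import json


def build_reply(response_body: str, show_cost_info: bool) -> str:
    """Join the reply text with an optional usage footnote."""
    data = json.loads(response_body)
    choices = data.get("choices") or [{}]
    message = choices[0].get("message") or {}
    final_parts = [message.get("content") or ""]

    usage = data.get("usage")  # may be absent; then no footnote is added
    if show_cost_info and usage:
        prompt = usage.get("prompt_tokens", 0)
        completion = usage.get("completion_tokens", 0)
        # Assumption: fall back to the sum when total_tokens is missing.
        total = usage.get("total_tokens", prompt + completion)
        # Hypothetical footnote layout, not the PR's actual format.
        final_parts.append(
            f"\n\n---\nTokens: {prompt:,} prompt + {completion:,} completion = {total:,} total"
        )
    return "".join(final_parts)


body = json.dumps({
    "choices": [{"message": {"content": "Hello"}}],
    "usage": {"prompt_tokens": 1000, "completion_tokens": 234, "total_tokens": 1234},
})
print(build_reply(body, show_cost_info=True))
```

With `show_cost_info=False`, or with a body that has no `usage` field, the function returns the reply text unchanged, which mirrors the "no footnote" and "no crash" cases listed in the test plan.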